Feat: add model format for dpa1 #3211
atype_embd = atype_embd_ext[:, :nloc, :]
# nf x nloc x nnei x tebd_dim
atype_embd_nnei = np.tile(atype_embd[:, :, np.newaxis, :], (1, 1, nnei, 1))
nlist_mask = nlist != -1
Check notice
Code scanning / CodeQL
Unused local variable Note
):
dtype = PRECISION_DICT[prec]
rtol, atol = get_tols(prec)
err_msg = f"idt={idt} prec={prec}"
Check notice
Code scanning / CodeQL
Unused local variable Note test
dd0.se_atten.mean = torch.tensor(davg, dtype=dtype, device=env.DEVICE)
dd0.se_atten.dstd = torch.tensor(dstd, dtype=dtype, device=env.DEVICE)
# dd1 = DescrptDPA1.deserialize(dd0.serialize())
model = torch.jit.script(dd0)
Check notice
Code scanning / CodeQL
Unused local variable Note test
resnet=False,
precision=precision,
)
self.w = self.w.squeeze(0)  # keep the weight shape to be [num_in]
Check warning
Code scanning / CodeQL
Overwriting attribute in super-class or sub-class Warning
NativeLayer
)
self.w = self.w.squeeze(0)  # keep the weight shape to be [num_in]
if self.uni_init:
self.w = 1.0
Check warning
Code scanning / CodeQL
Overwriting attribute in super-class or sub-class Warning
NativeLayer
self.w = self.w.squeeze(0)  # keep the weight shape to be [num_in]
if self.uni_init:
self.w = 1.0
self.b = 0.0
Check warning
Code scanning / CodeQL
Overwriting attribute in super-class or sub-class Warning
NativeLayer
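The three warnings above all concern a subclass reassigning `self.w` or `self.b` after the parent constructor has already set them. A minimal sketch of one way to keep each attribute assigned in a single place follows. The classes `BaseLayer` and `ScaleShiftLayer` and the `_init_params` hook are hypothetical stand-ins, not the deepmd `NativeLayer` API.

```python
import numpy as np


class BaseLayer:
    """Hypothetical stand-in for a parent layer; not the deepmd NativeLayer."""

    def __init__(self, num_in: int, seed: int = 0) -> None:
        self.num_in = num_in
        # w and b are assigned exactly once, here, from an overridable hook.
        self.w, self.b = self._init_params(np.random.default_rng(seed))

    def _init_params(self, rng):
        # Default: random weight of shape [1, num_in] and a zero bias.
        return rng.normal(size=(1, self.num_in)), np.zeros(self.num_in)


class ScaleShiftLayer(BaseLayer):
    """Hypothetical subclass that wants a 1-D weight of shape [num_in]."""

    def __init__(self, num_in: int, uni_init: bool = True, seed: int = 0) -> None:
        # Set before super().__init__ so the hook below can read it.
        self.uni_init = uni_init
        super().__init__(num_in, seed=seed)

    def _init_params(self, rng):
        if self.uni_init:
            # Unit scale and zero shift.
            return np.ones(self.num_in), np.zeros(self.num_in)
        w, b = super()._init_params(rng)
        return w.squeeze(0), b  # keep the weight shape to be [num_in]


layer = ScaleShiftLayer(4)
```

With this layout the subclass changes how the values are produced but never writes to `self.w` or `self.b` itself, which is the pattern the CodeQL rule objects to.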
Codecov Report
Attention:
Additional details and impacted files
@@             Coverage Diff              @@
##            devel    #3211       +/-   ##
===========================================
- Coverage   74.39%   20.72%   -53.68%
===========================================
  Files         345      346        +1
  Lines       31981    32509      +528
  Branches     1592     1594        +2
===========================================
- Hits        23791     6736    -17055
- Misses       7265    25075    +17810
+ Partials      925      698      -227
☔ View full report in Codecov by Sentry.
The serialization and deserialization of model_format/dpa1 should be tested.
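A round-trip test of the kind requested here could look roughly like the sketch below. `ToyDescriptor` is a hypothetical stand-in with a single variable; it is not the `DescrptDPA1` API, and only the `@variables` key is taken from the PR.

```python
import copy

import numpy as np


class ToyDescriptor:
    """Hypothetical stand-in for a descriptor; not the deepmd DescrptDPA1 API."""

    def __init__(self, ntypes: int, tebd_dim: int, seed: int = 0) -> None:
        rng = np.random.default_rng(seed)
        self.ntypes = ntypes
        self.tebd_dim = tebd_dim
        self.type_embedding = rng.normal(size=(ntypes, tebd_dim))

    def call(self, atype: np.ndarray) -> np.ndarray:
        # Look up the embedding vector of each atom type.
        return self.type_embedding[atype]

    def serialize(self) -> dict:
        return {
            "ntypes": self.ntypes,
            "tebd_dim": self.tebd_dim,
            "@variables": {"type_embedding": self.type_embedding.copy()},
        }

    @classmethod
    def deserialize(cls, data: dict) -> "ToyDescriptor":
        data = copy.deepcopy(data)  # never mutate the caller's dict
        variables = data.pop("@variables")
        obj = cls(**data)
        obj.type_embedding = variables["type_embedding"]
        return obj


def test_round_trip() -> None:
    dd0 = ToyDescriptor(ntypes=3, tebd_dim=4, seed=1)
    payload = dd0.serialize()
    dd1 = ToyDescriptor.deserialize(payload)
    atype = np.array([0, 2, 1, 1])
    # The restored object must give the same output and serialize identically.
    np.testing.assert_allclose(dd0.call(atype), dd1.call(atype))
    np.testing.assert_array_equal(
        dd1.serialize()["@variables"]["type_embedding"],
        payload["@variables"]["type_embedding"],
    )
    assert "@variables" in payload  # deserialize left its input intact


test_round_trip()
```

Comparing both the forward output and a second serialization catches variables that are written out but never read back.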
variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
attention_layers = data.pop("attention_layers", None)
Check notice
Code scanning / CodeQL
Unused local variable Note
Why is this popped but never used?
dd0_state_dict = dd0.se_atten.state_dict()
dd4_state_dict = dd4.se_atten.state_dict()

dd0_state_dict_attn = dd0.se_atten.dpa1_attention.state_dict()
Check notice
Code scanning / CodeQL
Unused local variable Note test
dd4_state_dict = dd4.se_atten.state_dict()

dd0_state_dict_attn = dd0.se_atten.dpa1_attention.state_dict()
dd4_state_dict_attn = dd4.se_atten.dpa1_attention.state_dict()
Check notice
Code scanning / CodeQL
Unused local variable Note test
data = copy.deepcopy(data)
variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
Check failure
Code scanning / CodeQL
Modification of parameter with default Error
default value
variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
attention_layers = data.pop("attention_layers", None)
Check failure
Code scanning / CodeQL
Modification of parameter with default Error
default value
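The alert above is about popping keys from a parameter that has a default value. A minimal sketch of why that is risky, and of the copy-first pattern, is given below. The function names and the `ntypes` key are hypothetical; only `@variables` and the `copy.deepcopy` call come from the snippets.

```python
import copy
from typing import Optional


def split_variables_bad(data: dict = {"@variables": {}, "ntypes": 1}) -> tuple:
    # Anti-pattern: the default dict is created once at definition time,
    # so pop() removes the key for every later call as well.
    variables = data.pop("@variables")
    return variables, data


def split_variables(data: Optional[dict] = None) -> tuple:
    # Build a fresh default per call, and copy caller input before mutating it.
    if data is None:
        data = {"@variables": {}, "ntypes": 1}
    else:
        data = copy.deepcopy(data)
    variables = data.pop("@variables")
    return variables, data


split_variables_bad()  # first call succeeds and empties the shared default
try:
    split_variables_bad()
    second_call_failed = False
except KeyError:
    second_call_failed = True
```

The second call to `split_variables_bad` fails because the first one already removed `@variables` from the shared default, whereas `split_variables` can be called any number of times.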
Then the scaled dot-product attention method is adopted:

.. math::
A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})=\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right)\mathcal{V}^{i,l},
The body of the math directive needs to be indented; otherwise it cannot be rendered correctly. See https://deepmodeling--3211.org.readthedocs.build/projects/deepmd/en/3211/api_py/deepmd.model_format.html#deepmd.model_format.DescrptDPA1
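As an illustration of the indentation being asked for, here is a sketch of a raw docstring in which the body of `.. math::` is indented relative to the directive, together with a small checker. The function names are hypothetical, and the checker is a rough heuristic rather than a real reStructuredText parser.

```python
import inspect


def attention_doc_example() -> None:
    r"""Hypothetical docstring showing an indented math directive.

    Then the scaled dot-product attention method is adopted:

    .. math::
        A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})
        = \varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{R}^{i,l}\right)
        \mathcal{V}^{i,l},
    """


def math_bodies_are_indented(doc: str) -> bool:
    """Return True if each ``.. math::`` line is followed by a deeper-indented body."""
    lines = doc.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == ".. math::":
            base = len(line) - len(line.lstrip())
            # First non-blank line after the directive, or "" if there is none.
            body = next((l for l in lines[i + 1:] if l.strip()), "")
            if len(body) - len(body.lstrip()) <= base:
                return False
    return True


ok = math_bodies_are_indented(inspect.getdoc(attention_doc_example))
```

A check like this could run over the package docstrings in a test, so that unindented directives are caught before the documentation build.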
w : np.ndarray, optional
The embedding weights of the layer.
This does not match the actual parameters.
w : np.ndarray, optional
The learnable weights of the normalization scale in the layer.
b : np.ndarray, optional
The learnable biases of the normalization shift in the layer.
This does not match the actual parameters.
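Docstring drift of this kind can be detected mechanically. The sketch below compares numpydoc-style `name : type` entries against the function signature. The helper names and the example function `layer_norm_init` are hypothetical, and the regular expression only handles the simple one-name-per-line layout quoted above.

```python
import inspect
import re


def documented_params(func) -> set:
    """Collect names from numpydoc-style ``name : type`` lines in the docstring."""
    doc = inspect.getdoc(func) or ""
    return set(re.findall(r"^(\w+) : ", doc, flags=re.MULTILINE))


def undocumented_and_stale(func) -> tuple:
    """Return (parameters missing from the docstring, documented but absent)."""
    actual = set(inspect.signature(func).parameters) - {"self"}
    documented = documented_params(func)
    return actual - documented, documented - actual


def layer_norm_init(num_in, eps=1e-5, uni_init=True):
    """Hypothetical constructor whose docstring has drifted from its signature.

    Parameters
    ----------
    num_in : int
        The input dimension of the layer.
    w : np.ndarray, optional
        The learnable weights of the normalization scale in the layer.
    """


missing, stale = undocumented_and_stale(layer_norm_init)
```

Here `missing` holds the parameters that the signature accepts but the docstring omits, and `stale` holds the documented names that the signature no longer has.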
This PR has been merged into #3696.
This PR adds the model format for the DPA1 model:
TODO: